project team
Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models III: Implementing the Bacterial Biothreat Benchmark (B3) Dataset
Ackerman, Gary, Wilson, Theodore, Kallenborn, Zachary, Shoemaker, Olivia, Wetzel, Anna, Peterson, Hayley, Danfora, Abigail, LaTourette, Jenna, Behlendorf, Brandon, Clifford, Douglas
The potential for rapidly-evolving frontier artificial intelligence (AI) models, especially large language models (LLMs), to facilitate bioterrorism or access to biological weapons has generated significant policy, academic, and public concern. Both model developers and policymakers seek to quantify and mitigate any risk, with an important element of such efforts being the development of model benchmarks that can assess the biosecurity risk posed by a particular model. This paper discusses the pilot implementation of the Bacterial Biothreat Benchmark (B3) dataset. It is the third in a series of three papers describing an overall Biothreat Benchmark Generation (BBG) framework, with previous papers detailing the development of the B3 dataset. The pilot involved running the benchmarks through a sample frontier AI model, followed by human evaluation of model responses, and an applied risk analysis of the results along several dimensions. Overall, the pilot demonstrated that the B3 dataset offers a viable, nuanced method for rapidly assessing the biosecurity risk posed by a LLM, identifying the key sources of that risk and providing guidance for priority areas of mitigation priority.
Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models II: Benchmark Generation Process
Ackerman, Gary, Kallenborn, Zachary, Wetzel, Anna, Peterson, Hayley, LaTourette, Jenna, Shoemaker, Olivia, Behlendorf, Brandon, Almakki, Sheriff, Clifford, Doug, Sheinbaum, Noah
The potential for rapidly-evolving frontier artificial intelligence (AI) models, especially large language models (LLMs), to facilitate bioterrorism or access to biological weapons has generated significant policy, academic, and public concern. Both model developers and policymakers seek to quantify and mitigate any risk, with an important element of such efforts being the development of model benchmarks that can assess the biosecurity risk posed by a particular model. This paper, the second in a series of three, describes the second component of a novel Biothreat Benchmark Generation (BBG) framework: the generation of the Bacterial Biothreat Benchmark (B3) dataset. The development process involved three complementary approaches: 1) web-based prompt generation, 2) red teaming, and 3) mining existing benchmark corpora, to generate over 7,000 potential benchmarks linked to the Task-Query Architecture that was developed during the first component of the project. A process of de-duplication, followed by an assessment of uplift diagnosticity, and general quality control measures, reduced the candidates to a set of 1,010 final benchmarks. This procedure ensured that these benchmarks are a) diagnostic in terms of providing uplift; b) directly relevant to biosecurity threats; and c) are aligned with a larger biosecurity architecture permitting nuanced analysis at different levels of analysis.
- North America > United States (0.14)
- Asia > North Korea (0.04)
- Asia > China > Hong Kong (0.04)
- Africa > Cameroon > Gulf of Guinea (0.04)
- Workflow (0.94)
- Research Report > Experimental Study (0.68)
- Health & Medicine (1.00)
- Government > Military (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
- (2 more...)
Agricultural Industry Initiatives on Autonomy: How collaborative initiatives of VDMA and AEF can facilitate complexity in domain crossing harmonization needs
Happich, Georg, Grever, Alexander, Schöning, Julius
The agricultural industry is undergoing a significant transformation with the increasing adoption of autonomous technologies. Addressing complex challenges related to safety and security, components and validation procedures, and liability distribution is essential to facilitate the adoption of autonomous technologies. This paper explores the collaborative groups and initiatives undertaken to address these challenges. These groups investigate inter alia three focal topics: 1) describe the functional architecture of the operational range, 2) define the work context, i.e., the realistic scenarios that emerge in various agricultural applications, and 3) the static and dynamic detection cases that need to be detected by sensor sets. Linked by the Agricultural Operational Design Domain (Agri-ODD), use case descriptions, risk analysis, and questions of liability can be handled. By providing an overview of these collaborative initiatives, this paper aims to highlight the joint development of autonomous agricultural systems that enhance the overall efficiency of farming operations.
- Europe > Germany (0.05)
- North America > United States (0.04)
Enhancing Team Diversity with Generative AI: A Novel Project Management Framework
This research-in-progress paper presents a new project management framework that utilises GenAI technology. The framework is designed to address the common challenge of uniform team compositions in academic and research project teams, particularly in universities and research institutions. It does so by integrating sociologically identified patterns of successful team member personalities and roles, using GenAI agents to fill gaps in team dynamics. This approach adds an additional layer of analysis to conventional project management processes by evaluating team members' personalities and roles and employing GenAI agents, fine-tuned on personality datasets, to fill specific team roles. Our initial experiments have shown improvements in the model's ability to understand and process personality traits, suggesting the potential effectiveness of GenAI teammates in real-world project settings. This paper aims to explore the practical application of AI in enhancing team diversity and project management
DaCapo: a modular deep learning framework for scalable 3D image segmentation
Patton, William, Rhoades, Jeff L., Zouinkhi, Marwan, Ackerman, David G., Malin-Mayor, Caroline, Adjavon, Diane, Heinrich, Larissa, Bennett, Davis, Zubov, Yurii, Team, CellMap Project, Weigel, Aubrey V., Funke, Jan
DaCapo is a specialized deep learning library tailored to expedite the training and application of existing machine learning approaches on large, near-isotropic image data. In this correspondence, we introduce DaCapo's unique features optimized for this specific domain, highlighting its modular structure, efficient experiment management tools, and scalable deployment capabilities. We discuss its potential to improve access to large-scale, isotropic image segmentation and invite the community to explore and contribute to this open-source initiative.
A Practical Multilevel Governance Framework for Autonomous and Intelligent Systems
Pöhler, Lukas D., Diepold, Klaus, Wallach, Wendell
Autonomous and intelligent systems (AIS) facilitate a wide range of beneficial applications across a variety of different domains. However, technical characteristics such as unpredictability and lack of transparency, as well as potential unintended consequences, pose considerable challenges to the current governance infrastructure. Furthermore, the speed of development and deployment of applications outpaces the ability of existing governance institutions to put in place effective ethical-legal oversight. New approaches for agile, distributed and multilevel governance are needed. This work presents a practical framework for multilevel governance of AIS. The framework enables mapping actors onto six levels of decision-making including the international, national and organizational levels. Furthermore, it offers the ability to identify and evolve existing tools or create new tools for guiding the behavior of actors within the levels. Governance mechanisms enable actors to shape and enforce regulations and other tools, which when complemented with good practices contribute to effective and comprehensive governance.
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- (7 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government > Military (0.68)
- Law > International Law (0.67)
AI Ethics and Governance in Practice: An Introduction
Leslie, David, Rincon, Cami, Briggs, Morgan, Perini, Antonella, Jayadeva, Smera, Borda, Ann, Bennett, SJ, Burr, Christopher, Aitken, Mhairi, Katell, Michael, Fischer, Claudia
AI systems may have transformative and long-term effects on individuals and society. To manage these impacts responsibly and direct the development of AI systems toward optimal public benefit, considerations of AI ethics and governance must be a first priority. In this workbook, we introduce and describe our PBG Framework, a multi-tiered governance model that enables project teams to integrate ethical values and practical principles into their innovation practices and to have clear mechanisms for demonstrating and documenting this.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Malaysia (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (2 more...)
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.46)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Applied AI (1.00)
AI Fairness in Practice
Leslie, David, Rincon, Cami, Briggs, Morgan, Perini, Antonella, Jayadeva, Smera, Borda, Ann, Bennett, SJ, Burr, Christopher, Aitken, Mhairi, Katell, Michael, Fischer, Claudia, Wong, Janis, Garcia, Ismael Kherroubi
Reaching consensus on a commonly accepted definition of AI Fairness has long been a central challenge in AI ethics and governance. There is a broad spectrum of views across society on what the concept of fairness means and how it should best be put to practice. In this workbook, we tackle this challenge by exploring how a context-based and society-centred approach to understanding AI Fairness can help project teams better identify, mitigate, and manage the many ways that unfair bias and discrimination can crop up across the AI project workflow. We begin by exploring how, despite the plurality of understandings about the meaning of fairness, priorities of equality and non-discrimination have come to constitute the broadly accepted core of its application as a practical principle. We focus on how these priorities manifest in the form of equal protection from direct and indirect discrimination and from discriminatory harassment. These elements form ethical and legal criteria based upon which instances of unfair bias and discrimination can be identified and mitigated across the AI project workflow. We then take a deeper dive into how the different contexts of the AI project lifecycle give rise to different fairness concerns. This allows us to identify several types of AI Fairness (Data Fairness, Application Fairness, Model Design and Development Fairness, Metric-Based Fairness, System Implementation Fairness, and Ecosystem Fairness) that form the basis of a multi-lens approach to bias identification, mitigation, and management. Building on this, we discuss how to put the principle of AI Fairness into practice across the AI project workflow through Bias Self-Assessment and Bias Risk Management as well as through the documentation of metric-based fairness criteria in a Fairness Position Statement.
- Europe > United Kingdom > Wales (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (9 more...)
- Workflow (1.00)
- Research Report > Experimental Study (1.00)
- Instructional Material > Course Syllabus & Notes (0.67)
- Law > Civil Rights & Constitutional Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- (10 more...)
AI Sustainability in Practice Part One: Foundations for Sustainable AI Projects
Leslie, David, Rincon, Cami, Briggs, Morgan, Perini, Antonella, Jayadeva, Smera, Borda, Ann, Bennett, SJ, Burr, Christopher, Aitken, Mhairi, Katell, Michael, Fischer, Claudia, Wong, Janis, Garcia, Ismael Kherroubi
Sustainable AI projects are continuously responsive to the transformative effects as well as short-, medium-, and long-term impacts on individuals and society that the design, development, and deployment of AI technologies may have. Projects, which centre AI Sustainability, ensure that values-led, collaborative, and anticipatory reflection both guides the assessment of potential social and ethical impacts and steers responsible innovation practices. This workbook is the first part of a pair that provides the concepts and tools needed to put AI Sustainability into practice. It introduces the SUM Values, which help AI project teams to assess the potential societal impacts and ethical permissibility of their projects. It then presents a Stakeholder Engagement Process (SEP), which provides tools to facilitate proportionate engagement of and input from stakeholders with an emphasis on equitable and meaningful participation and positionality awareness.
- Europe > United Kingdom (0.92)
- North America > United States > District of Columbia > Washington (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- (7 more...)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Applied AI (1.00)
Data-Driven Risk Modeling for Infrastructure Projects Using Artificial Intelligence Techniques
Managing project risk is a key part of the successful implementation of any large project and is widely recognized as a best practice for public agencies to deliver infrastructures. The conventional method of identifying and evaluating project risks involves getting input from subject matter experts at risk workshops in the early phases of a project. As a project moves through its life cycle, these identified risks and their assessments evolve. Some risks are realized to become issues, some are mitigated, and some are retired as no longer important. Despite the value provided by conventional expert-based approaches, several challenges remain due to the time-consuming and expensive processes involved. Moreover, limited is known about how risks evolve from ex-ante to ex-post over time. How well does the project team identify and evaluate risks in the initial phase compared to what happens during project execution? Using historical data and artificial intelligence techniques, this study addressed these limitations by introducing a data-driven framework to identify risks automatically and to examine the quality of early risk registers and risk assessments. Risk registers from more than 70 U.S. major transportation projects form the input dataset.
- Asia > Middle East > UAE (0.14)
- North America > United States > Washington (0.04)
- North America > United States > Montana (0.04)
- (12 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Transportation > Infrastructure & Services (1.00)
- Law (1.00)
- Information Technology (1.00)
- (5 more...)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)